ADADELTA: An Adaptive Learning Rate Method

نویسنده

  • Matthew D. Zeiler
چکیده

We present a novel per-dimension learning rate method for gradient descent called ADADELTA. The method dynamically adapts over time using only first order information and has minimal computational overhead beyond vanilla stochastic gradient descent. The method requires no manual tuning of a learning rate and appears robust to noisy gradient information, different model architecture choices, various data modalities and selection of hyperparameters. We show promising results compared to other methods on the MNIST digit classification task using a single machine and on a large scale voice dataset in a distributed cluster environment.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improved End-to-End Speech Recognition Using Adaptive Per-Dimensional Learning Rate Methods

The introduction of deep neural networks (DNNs) leads to a significant improvement of the automatic speech recognition (ASR) performance. However, the whole ASR system remains sophisticated due to the dependent on the hidden Markov model (HMM). Recently, a new endto-end ASR framework, which utilizes recurrent neural networks (RNNs) to directly model context-independent targets with connectionis...

متن کامل

Hot Swapping for Online Adaptation of Optimization Hyperparameters

We describe a general framework for online adaptation of optimization hyperparameters by ‘hot swapping’ their values during learning. We investigate this approach in the context of adaptive learning rate selection using an explore-exploit strategy from the multi-armed bandit literature. Experiments on a benchmark neural network show that the hot swapping approach leads to consistently better so...

متن کامل

Cystoscopy Image Classication Using Deep Convolutional Neural Networks

In the past three decades, the use of smart methods in medical diagnostic systems has attractedthe attention of many researchers. However, no smart activity has been provided in the eld ofmedical image processing for diagnosis of bladder cancer through cystoscopy images despite the highprevalence in the world. In this paper, two well-known convolutional neural networks (CNNs) ...

متن کامل

Designing stable neural identifier based on Lyapunov method

The stability of learning rate in neural network identifiers and controllers is one of the challenging issues which attracts great interest from researchers of neural networks. This paper suggests adaptive gradient descent algorithm with stable learning laws for modified dynamic neural network (MDNN) and studies the stability of this algorithm. Also, stable learning algorithm for parameters of ...

متن کامل

Perfect Tracking of Supercavitating Non-minimum Phase Vehicles Using a New Robust and Adaptive Parameter-optimal Iterative Learning Control

In this manuscript, a new method is proposed to provide a perfect tracking of the supercavitation system based on a new two-state model. The tracking of the pitch rate and angle of attack for fin and cavitator input is of the aim. The pitch rate of the supercavitation with respect to fin angle is found as a non-minimum phase behavior. This effect reduces the speed of command pitch rate. Control...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1212.5701  شماره 

صفحات  -

تاریخ انتشار 2012